Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alternate maxPolygonToCellsSize implementation #805

Merged
merged 9 commits into from
Dec 4, 2023

Conversation

nrabinowitz
Copy link
Collaborator

@nrabinowitz nrabinowitz commented Nov 29, 2023

Adds an alternate implementation of maxPolygonToCellsSize, based on a coarse run of polygonToCellsExperimental. This approach is slower than the current maxPolygonToCellsSize, but should accommodate the new modes and fix cases where we were previously underestimating the new polyfill algo around the poles. It generally provides a much better ratio of memory allocation to memory use.

Algorithm Overview

  • Initialize a polygonToCellsCompact iterator, without taking the first step.
  • Based on a rough estimate of the polygon bounding box area, choose a coarser H3 resolution that should give us no more than MAX_SIZE_CELL_THRESHOLD cells (threshold is currently set to 100, which is a little arbitrary)
  • Set the output estimate to 0.
  • Run the polyfill at the new resolution, using a fast new CONTAINMENT_OVERLAPPING_BBOX mode (like CONTAINMENT_OVERLAPPING, but checks the cell bounding box rather than the boundary, for about 2-3x perf improvement). For each output cell, add the count of its children at the target res to the output estimate.

Results

  • The new algorithm is slower than the old algorithm, which ran in constant time. However, it's still very fast - calculating max size for all 1619 country polygons at res 15 takes 469ms. We could probably make it faster by lowering the MAX_SIZE_CELL_THRESHOLD value, at the cost of precision. Benchmarks:
Res 0	-- maxPolygonToCellsSize_AllCountries1: 3207.800000 microseconds per iteration (100 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 7364.940000 microseconds per iteration (100 iterations)
Res 1	-- maxPolygonToCellsSize_AllCountries1: 5064.960000 microseconds per iteration (50 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 24727.380000 microseconds per iteration (50 iterations)
Res 2	-- maxPolygonToCellsSize_AllCountries1: 3322.272727 microseconds per iteration (33 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 64993.757576 microseconds per iteration (33 iterations)
Res 3	-- maxPolygonToCellsSize_AllCountries1: 5269.680000 microseconds per iteration (25 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 121757.000000 microseconds per iteration (25 iterations)
Res 4	-- maxPolygonToCellsSize_AllCountries1: 3385.500000 microseconds per iteration (20 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 190047.300000 microseconds per iteration (20 iterations)
Res 5	-- maxPolygonToCellsSize_AllCountries1: 5387.375000 microseconds per iteration (16 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 227737.000000 microseconds per iteration (16 iterations)
Res 6	-- maxPolygonToCellsSize_AllCountries1: 3489.357143 microseconds per iteration (14 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 284585.928571 microseconds per iteration (14 iterations)
Res 7	-- maxPolygonToCellsSize_AllCountries1: 5646.250000 microseconds per iteration (12 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 368341.333333 microseconds per iteration (12 iterations)
Res 8	-- maxPolygonToCellsSize_AllCountries1: 3677.545455 microseconds per iteration (11 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 456091.545455 microseconds per iteration (11 iterations)
Res 9	-- maxPolygonToCellsSize_AllCountries1: 5813.600000 microseconds per iteration (10 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 471468.700000 microseconds per iteration (10 iterations)
Res 10	-- maxPolygonToCellsSize_AllCountries1: 3742.333333 microseconds per iteration (9 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 475343.888889 microseconds per iteration (9 iterations)
Res 11	-- maxPolygonToCellsSize_AllCountries1: 5914.000000 microseconds per iteration (8 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 475462.750000 microseconds per iteration (8 iterations)
Res 12	-- maxPolygonToCellsSize_AllCountries1: 4042.571429 microseconds per iteration (7 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 469281.000000 microseconds per iteration (7 iterations)
Res 13	-- maxPolygonToCellsSize_AllCountries1: 6715.857143 microseconds per iteration (7 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 470467.571429 microseconds per iteration (7 iterations)
Res 14	-- maxPolygonToCellsSize_AllCountries1: 4498.166667 microseconds per iteration (6 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 471111.166667 microseconds per iteration (6 iterations)
Res 15	-- maxPolygonToCellsSize_AllCountries1: 6207.000000 microseconds per iteration (6 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 469306.833333 microseconds per iteration (6 iterations)
  • The new algorithm provides a better ratio of memory allocation to memory use in comparison to the old algo in most cases:
image image image

If it seems useful, I can try different values for MAX_SIZE_CELL_THRESHOLD, which I assume will allow us to trade some of the ratio advantage for faster perf.

TODO

  • Use the new function as the estimator for polygonToCellsExperimental tests
  • Add additional tests to cover error conditions

@dfellis
Copy link
Collaborator

dfellis commented Nov 29, 2023

However, it's still very fast - calculating max size for all 1619 country polygons at res 15 takes 469ms.

You know you're saying that on average, calculating the max size takes about 290 nanoseconds. I just can't see anyone saying they don't have 290ns of time to spare, myself. ;)

@dfellis
Copy link
Collaborator

dfellis commented Nov 29, 2023

The polygonToCells tests are what's failing, btw

I think we hardwired some of the maxPolygonToCellsSize outputs in those tests, so probably not a big deal. ;)

@nrabinowitz
Copy link
Collaborator Author

The polygonToCells tests are what's failing, btw

I think we hardwired some of the maxPolygonToCellsSize outputs in those tests, so probably not a big deal. ;)

About to look, but I'll bet the failure has to do with adding the new polyfill mode - test issue, not code issue

@nrabinowitz
Copy link
Collaborator Author

However, it's still very fast - calculating max size for all 1619 country polygons at res 15 takes 469ms.

You know you're saying that on average, calculating the max size takes about 290 nanoseconds. I just can't see anyone saying they don't have 290ns of time to spare, myself. ;)

Agree, but it can add up in a hot loop, and there are probably some pathological cases (e.g. polygons with millions of vertexes). Those might be slow for the existing estimator too though.

@dfellis
Copy link
Collaborator

dfellis commented Nov 29, 2023

However, it's still very fast - calculating max size for all 1619 country polygons at res 15 takes 469ms.

You know you're saying that on average, calculating the max size takes about 290 nanoseconds. I just can't see anyone saying they don't have 290ns of time to spare, myself. ;)

Agree, but it can add up in a hot loop, and there are probably some pathological cases (e.g. polygons with millions of vertexes). Those might be slow for the existing estimator too though.

Yes, but when you were talking about the current algo being constant-time, it's only that after the bounding box has been determined, which is O(n) on vertex count. Presumably the pathological case on vertex count would be similar between the two?

But also, if we are keeping the maxPolygonToCellsSize implementation to approximately target resolution - 3, we did already discuss that it's on the order of ~0.2% the total runtime of polygonToCells and therefore in the rounding error range for measurement's sake. I just don't see any situation where maxPolygonToCellsSize is in a hot loop without polygonToCells, so your improvements there more than make up for the marginal increase in runtime here, I think?

@dfellis
Copy link
Collaborator

dfellis commented Nov 29, 2023

The polygonToCells tests are what's failing, btw
I think we hardwired some of the maxPolygonToCellsSize outputs in those tests, so probably not a big deal. ;)

About to look, but I'll bet the failure has to do with adding the new polyfill mode - test issue, not code issue

Hmm... But it should be backwards compatible, right? We already required a mode that was forced to be zero with V4, so the actual function call signature and behavior should not have changed. It should only be things like maxPolygonToCellsSize producing a different estimate and polygonToCells outputting cells in a different order from before. But I don't think we tested explicit order before, right? That's why I suspect maxPolygonToCellsSize (especially since we aren't changing polygonToCells in this PR, either).

@nrabinowitz
Copy link
Collaborator Author

The polygonToCells tests are what's failing, btw
I think we hardwired some of the maxPolygonToCellsSize outputs in those tests, so probably not a big deal. ;)

About to look, but I'll bet the failure has to do with adding the new polyfill mode - test issue, not code issue

Hmm... But it should be backwards compatible, right? We already required a mode that was forced to be zero with V4, so the actual function call signature and behavior should not have changed. It should only be things like maxPolygonToCellsSize producing a different estimate and polygonToCells outputting cells in a different order from before. But I don't think we tested explicit order before, right? That's why I suspect maxPolygonToCellsSize (especially since we aren't changing polygonToCells in this PR, either).

The tests still use the old estimator - it was the way we were checking for invalid modes in the tests (using 3 instead of CONTAINMENT_INVALID as the first invalid mode).

@nrabinowitz
Copy link
Collaborator Author

However, it's still very fast - calculating max size for all 1619 country polygons at res 15 takes 469ms.

You know you're saying that on average, calculating the max size takes about 290 nanoseconds. I just can't see anyone saying they don't have 290ns of time to spare, myself. ;)

Agree, but it can add up in a hot loop, and there are probably some pathological cases (e.g. polygons with millions of vertexes). Those might be slow for the existing estimator too though.

Yes, but when you were talking about the current algo being constant-time, it's only that after the bounding box has been determined, which is O(n) on vertex count. Presumably the pathological case on vertex count would be similar between the two?

But also, if we are keeping the maxPolygonToCellsSize implementation to approximately target resolution - 3, we did already discuss that it's on the order of ~0.2% the total runtime of polygonToCells and therefore in the rounding error range for measurement's sake. I just don't see any situation where maxPolygonToCellsSize is in a hot loop without polygonToCells, so your improvements there more than make up for the marginal increase in runtime here, I think?

I think the bigger concern here is that estimator + runtime could remove some of the speed advantage of the new algo. I dropped the threshold to 10 and as expected it improved perf at the cost of accuracy, but as it's still better than the old version I think that's worthwhile:

Res 0	-- maxPolygonToCellsSize_AllCountries1: 3434.910000 microseconds per iteration (100 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 7560.140000 microseconds per iteration (100 iterations)
Res 1	-- maxPolygonToCellsSize_AllCountries1: 5232.460000 microseconds per iteration (50 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 21720.200000 microseconds per iteration (50 iterations)
Res 2	-- maxPolygonToCellsSize_AllCountries1: 3387.454545 microseconds per iteration (33 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 40174.454545 microseconds per iteration (33 iterations)
Res 3	-- maxPolygonToCellsSize_AllCountries1: 5369.400000 microseconds per iteration (25 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 60000.520000 microseconds per iteration (25 iterations)
Res 4	-- maxPolygonToCellsSize_AllCountries1: 3463.150000 microseconds per iteration (20 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 77886.200000 microseconds per iteration (20 iterations)
Res 5	-- maxPolygonToCellsSize_AllCountries1: 5596.500000 microseconds per iteration (16 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 98886.000000 microseconds per iteration (16 iterations)
Res 6	-- maxPolygonToCellsSize_AllCountries1: 3696.714286 microseconds per iteration (14 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 121430.928571 microseconds per iteration (14 iterations)
Res 7	-- maxPolygonToCellsSize_AllCountries1: 5667.166667 microseconds per iteration (12 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 137217.416667 microseconds per iteration (12 iterations)
Res 8	-- maxPolygonToCellsSize_AllCountries1: 3768.727273 microseconds per iteration (11 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 139207.363636 microseconds per iteration (11 iterations)
Res 9	-- maxPolygonToCellsSize_AllCountries1: 5959.700000 microseconds per iteration (10 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 140786.500000 microseconds per iteration (10 iterations)
Res 10	-- maxPolygonToCellsSize_AllCountries1: 3861.888889 microseconds per iteration (9 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 139234.666667 microseconds per iteration (9 iterations)
Res 11	-- maxPolygonToCellsSize_AllCountries1: 6060.250000 microseconds per iteration (8 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 141060.375000 microseconds per iteration (8 iterations)
Res 12	-- maxPolygonToCellsSize_AllCountries1: 4089.428571 microseconds per iteration (7 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 141696.285714 microseconds per iteration (7 iterations)
Res 13	-- maxPolygonToCellsSize_AllCountries1: 6343.285714 microseconds per iteration (7 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 139364.000000 microseconds per iteration (7 iterations)
Res 14	-- maxPolygonToCellsSize_AllCountries1: 4170.000000 microseconds per iteration (6 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 139915.666667 microseconds per iteration (6 iterations)
Res 15	-- maxPolygonToCellsSize_AllCountries1: 6595.500000 microseconds per iteration (6 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 139564.333333 microseconds per iteration (6 iterations)
image image image

src/h3lib/lib/polyfill.c Outdated Show resolved Hide resolved
@coveralls
Copy link

coveralls commented Nov 29, 2023

Coverage Status

coverage: 98.822% (+0.02%) from 98.806%
when pulling c5556bb on nrabinowitz:max-polyfill-size
into 9af7218 on uber:master.

@dfellis
Copy link
Collaborator

dfellis commented Nov 29, 2023

However, it's still very fast - calculating max size for all 1619 country polygons at res 15 takes 469ms.

You know you're saying that on average, calculating the max size takes about 290 nanoseconds. I just can't see anyone saying they don't have 290ns of time to spare, myself. ;)

Agree, but it can add up in a hot loop, and there are probably some pathological cases (e.g. polygons with millions of vertexes). Those might be slow for the existing estimator too though.

Yes, but when you were talking about the current algo being constant-time, it's only that after the bounding box has been determined, which is O(n) on vertex count. Presumably the pathological case on vertex count would be similar between the two?
But also, if we are keeping the maxPolygonToCellsSize implementation to approximately target resolution - 3, we did already discuss that it's on the order of ~0.2% the total runtime of polygonToCells and therefore in the rounding error range for measurement's sake. I just don't see any situation where maxPolygonToCellsSize is in a hot loop without polygonToCells, so your improvements there more than make up for the marginal increase in runtime here, I think?

I think the bigger concern here is that estimator + runtime could remove some of the speed advantage of the new algo. I dropped the threshold to 10 and as expected it improved perf at the cost of accuracy, but as it's still better than the old version I think that's worthwhile:

Res 0	-- maxPolygonToCellsSize_AllCountries1: 3434.910000 microseconds per iteration (100 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 7560.140000 microseconds per iteration (100 iterations)
Res 1	-- maxPolygonToCellsSize_AllCountries1: 5232.460000 microseconds per iteration (50 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 21720.200000 microseconds per iteration (50 iterations)
Res 2	-- maxPolygonToCellsSize_AllCountries1: 3387.454545 microseconds per iteration (33 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 40174.454545 microseconds per iteration (33 iterations)
Res 3	-- maxPolygonToCellsSize_AllCountries1: 5369.400000 microseconds per iteration (25 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 60000.520000 microseconds per iteration (25 iterations)
Res 4	-- maxPolygonToCellsSize_AllCountries1: 3463.150000 microseconds per iteration (20 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 77886.200000 microseconds per iteration (20 iterations)
Res 5	-- maxPolygonToCellsSize_AllCountries1: 5596.500000 microseconds per iteration (16 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 98886.000000 microseconds per iteration (16 iterations)
Res 6	-- maxPolygonToCellsSize_AllCountries1: 3696.714286 microseconds per iteration (14 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 121430.928571 microseconds per iteration (14 iterations)
Res 7	-- maxPolygonToCellsSize_AllCountries1: 5667.166667 microseconds per iteration (12 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 137217.416667 microseconds per iteration (12 iterations)
Res 8	-- maxPolygonToCellsSize_AllCountries1: 3768.727273 microseconds per iteration (11 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 139207.363636 microseconds per iteration (11 iterations)
Res 9	-- maxPolygonToCellsSize_AllCountries1: 5959.700000 microseconds per iteration (10 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 140786.500000 microseconds per iteration (10 iterations)
Res 10	-- maxPolygonToCellsSize_AllCountries1: 3861.888889 microseconds per iteration (9 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 139234.666667 microseconds per iteration (9 iterations)
Res 11	-- maxPolygonToCellsSize_AllCountries1: 6060.250000 microseconds per iteration (8 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 141060.375000 microseconds per iteration (8 iterations)
Res 12	-- maxPolygonToCellsSize_AllCountries1: 4089.428571 microseconds per iteration (7 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 141696.285714 microseconds per iteration (7 iterations)
Res 13	-- maxPolygonToCellsSize_AllCountries1: 6343.285714 microseconds per iteration (7 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 139364.000000 microseconds per iteration (7 iterations)
Res 14	-- maxPolygonToCellsSize_AllCountries1: 4170.000000 microseconds per iteration (6 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 139915.666667 microseconds per iteration (6 iterations)
Res 15	-- maxPolygonToCellsSize_AllCountries1: 6595.500000 microseconds per iteration (6 iterations)
	-- maxPolygonToCellsSize_AllCountries2: 139564.333333 microseconds per iteration (6 iterations)

image image image

On this, I'm fine with reducing the max size, but the fair comparison would be old algo + old estimator vs new algo + new estimator. The old algo was coupled with a lot of false positives to deal with, right?

@nrabinowitz
Copy link
Collaborator Author

On this, I'm fine with reducing the max size, but the fair comparison would be old algo + old estimator vs new algo + new estimator. The old algo was coupled with a lot of false positives to deal with, right?

Yes, I did - I'm updating the benchmark in this PR. With the new estimator, the new algo is slower by res 4:

Res 0	-- polygonToCells_AllCountries1: 266660.600000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries2: 53133.200000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries3: 343897.200000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries4: 439753.400000 microseconds per iteration (5 iterations)
Res 1	-- polygonToCells_AllCountries1: 447154.200000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries2: 58717.200000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries3: 130033.200000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries4: 164104.600000 microseconds per iteration (5 iterations)
Res 2	-- polygonToCells_AllCountries1: 388039.000000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries2: 171521.200000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries3: 282179.000000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries4: 321938.800000 microseconds per iteration (5 iterations)
Res 3	-- polygonToCells_AllCountries1: 878200.600000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries2: 539694.600000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries3: 770671.600000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries4: 912821.800000 microseconds per iteration (5 iterations)
Res 4	-- polygonToCells_AllCountries1: 2130698.000000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries2: 2233032.000000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries3: 2870595.000000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries4: 3706956.400000 microseconds per iteration (5 iterations)
Res 5	-- polygonToCells_AllCountries1: 10037822.600000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries2: 12237428.800000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries3: 14811909.000000 microseconds per iteration (5 iterations)
	-- polygonToCells_AllCountries4: 23855347.400000 microseconds per iteration (5 iterations)

Compare with the previous results, which didn't include either estimator.

I think this is probably an acceptable tradeoff - we may not need/want to use the estimator at all once we have streaming, and the old estimator was actually wrong for the new modes, and possibly for some polar areas. But it's not no-cost.

@@ -48,7 +48,8 @@ typedef enum {
CONTAINMENT_CENTER = 0, ///< Cell center is contained in the shape
CONTAINMENT_FULL = 1, ///< Cell is fully contained in the shape
CONTAINMENT_OVERLAPPING = 2, ///< Cell overlaps the shape at any point
CONTAINMENT_INVALID = 3 ///< This mode is invalid and should not be used
CONTAINMENT_OVERLAPPING_BBOX = 3, ///< Cell bounding box overlaps shape
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this purely an internal detail, or should be exposed to the user?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think users could potentially benefit from this - it's a faster (2-3x) version of CONTAINMENT_OVERLAPPING that may contain false positives, but no false negatives. Not appropriate for most use cases, but some users might find it useful.

return iter.error;
}

static int MAX_SIZE_CELL_THRESHOLD = 10;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per @dfellis - we could potentially allow the caller to configure this value, by supplying it in the flags. The caller could give either a specific value, or an "order of magitude" value (exponent of 10) that could make this more accurate.

Co-authored-by: David Ellis <[email protected]>
@nrabinowitz nrabinowitz merged commit a61a464 into uber:master Dec 4, 2023
33 checks passed
@nrabinowitz nrabinowitz deleted the max-polyfill-size branch December 4, 2023 19:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants